Article Info

Article history:
Received Apr 10, 2018
Revised Jun 21, 2018
Accepted Jul 10, 2018

Keyword:
ECG
Feature extraction
HRV
Sleep apnea

ABSTRACT

Sleep apnea is a common sleep disorder that interferes with a person's breathing. During sleep, a
person can stop breathing for a moment, which deprives the body of oxygen for periods lasting from
several seconds to minutes, and even up to hours. If it persists over a long period, it can result in
more serious diseases, e.g. high blood pressure, heart failure, stroke, diabetes, etc. Sleep apnea can
be prevented by identifying its indications from ECG, EEG, or other signals so that early prevention
can be performed. The purpose of this study is to build a classification model to identify sleep
disorders from the Heart Rate Variability (HRV) features that can be obtained from Electrocardiogram
(ECG) signals. In this study, HRV features were processed using several classification methods, i.e.
ANN, KNN, N-Bayes, and linear SVM. The classification is performed using a subject-specific scheme and
a subject-independent scheme. The simulation results show that the SVM method achieves higher accuracy
than the three other methods in identifying sleep apnea, while the time domain features show the most
dominant performance among the HRV features.

1. INTRODUCTION

Sleep is very important for the human body to perform optimally. During sleep, the body forms
and regenerates cells, supports brain function, and recharges its energy. For children and adolescents,
sleep is required to support the growth process. Sleep can be divided into two phases, i.e. rapid eye
movement (REM) and non-rapid eye movement (NREM), which alternate repeatedly during sleep [1]. When a
person does not experience normal REM and NREM cycles, the body will experience various adverse effects
such as fatigue, decreased ability to concentrate, disrupted body metabolism, and so on [2].

There are various types of sleep disorder, e.g. insomnia, narcolepsy, sleep apnea, parasomnia,
hypersomnia, restless leg syndrome, etc. [3]. Sleep apnea is a disturbance of the breathing process that
occurs because the wall of the throat relaxes and narrows during sleep [4]. While sleeping, the muscles
of the throat become relaxed and weak. When the muscles are too weak and the condition is not treated
immediately, the airways may constrict or even become blocked, which can potentially cause health
problems, accidents, and premature death. There are 3 types of sleep apnea, i.e. obstructive sleep
apnea, caused by obstruction of the respiratory tract; central sleep apnea, caused by unstable
respiratory control centers that result in the brain failing to signal the breathing muscles [4],[5];
and mixed (complex) sleep apnea, which is a combination of obstructive and central apnea. Sleep apnea is
very common and is usually found more in men than in women. This condition can occur in patients of any
age, but it is more common in middle-aged adults. Sleep apnea can be managed by recognizing its symptoms
or signs, reducing risk factors, and consulting a doctor for further action. Common symptoms are
snoring, breathing interruptions during sleep noticed by others, sudden waking with shortness of
breath, headache in the morning, insomnia, attention problems, irritability, and hypersomnia. Sleep
apnea can cause hypertension (high blood pressure), stroke, obesity, and diabetes. Sleep apnea cannot be
cured outright, but it can be alleviated by treatments such as behavioral therapy, positive pressure
therapy, installation of an oral breathing apparatus, and surgery [6], [7].

Several studies have been performed on sleep apnea identification. Almazaydeh et al. performed
obstructive sleep apnea detection using the support vector machine (SVM) method [8]. The features used
in that paper are mean epoch, standard deviation epoch, NN50 (variant 1), NN50 (variant 2), pNN50,
etc. They used an ECG signal database from physionet.org as their data set. Yilmaz et al., on the other
hand, proposed sleep stage and obstructive apneic epoch classification in order to classify sleep
stages and sleep apnea automatically using single-lead ECG [9]. In that research, Yilmaz et al.
performed data preprocessing and feature extraction before the classification. Another paper, from
Carolina Varon et al., performed sleep apnea classification using four easily computable features,
three generally known ones and a newly proposed feature [10]. They performed classification using least
squares support vector machines (LS-SVM) with an RBF kernel. Finally, Sani M. Isa et al. proposed the
use of principal component analysis (PCA) in sleep apnea identification in order to improve the
accuracy of the classification process [11].

Given the studies that have been performed in this area, it is interesting to evaluate the
performance of well-known classification methods under the same simulation conditions. In this
research, we evaluate the performance of ANN, KNN, N-Bayes, and support vector machine in classifying
sleep apnea. Most previous studies perform classification for 2 classes of sleep apnea; in this study,
we perform classification for 2, 3, and 4 classes of sleep apnea. We also perform feature extraction
with four techniques, i.e. time domain, geometrical (histogram-based), Poincare, and frequency domain,
and evaluate which features achieve the best performance for sleep apnea identification. By the end of
the study, we expect to achieve the best accuracy of sleep apnea identification with machine learning
methods. We also perform classification under two schemes, i.e. the subject-specific scheme and the
subject-independent scheme.
This paper is organized as follows. The research methodology is explained in Section 2. The simulation
results and discussion are presented in Section 3, while the paper is concluded in Section 4.


2. RESEARCH METHOD 

This research focuses on the HRV features of ECG signals to identify sleep apnea. We create a
classification model based on the data obtained from the ECG signal. In this research, we identify
sleep apnea in up to 4 classes, i.e. non-sleep apnea, hypopnea, obstructive apnea, and central apnea.
Figure 1 shows the methodology flowchart of the research. The following subsections present the details
of every step in the flowchart, starting from data collection, data pre-processing, feature extraction,
and classification model development, through to performance evaluation.


[Flowchart: Data Collection -> Data Pre-Processing -> Feature Extraction -> Classification Model Development -> Performance Evaluation]

Figure 1. Research methodology


2.1. Data collection 

At this stage, the data are collected. The data used is the MIT-BIH Polysomnographic Database (slpdb)
[12], obtained from https://physionet.org. This database contains recordings of several physiological
signals of subjects during sleep, i.e. ECG, EEG, blood pressure, and respiratory signals, together with
calibration constants, the length of the recording, age, gender, and weight. The data come from sleep
records of male subjects aged 32-56 years (average 43 years) and weighing 89-152 kg (average 119 kg).
The data consist of 18 recordings of 16 people with different measurement durations. Each recording is
divided into 30-second segments, and each segment forms 1 sample that proceeds to the data
pre-processing phase. Each sample contains the RR intervals and the sleep apnea annotations. Table 1
shows the recordings after division into 30-second samples, which are used to obtain the HRV features.
Figure 2 shows one of the signals from the MIT-BIH Polysomnographic Database (slp01a) before it is
separated into 30-second segments.
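
The 30-second segmentation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `split_into_epochs` and `expected_epoch_count` are hypothetical, and the sketch assumes that each RR interval is already paired with its beat time in seconds from the start of the record.

```python
def split_into_epochs(beat_times_s, rr_intervals_ms, epoch_s=30.0):
    """Group RR intervals into consecutive fixed-length epochs.

    beat_times_s: time of each beat, in seconds from the start of the record.
    rr_intervals_ms: RR interval (ms) associated with each beat.
    Returns a list with one list of RR intervals per epoch.
    """
    if len(beat_times_s) != len(rr_intervals_ms):
        raise ValueError("beat_times_s and rr_intervals_ms must have equal length")
    epochs = []
    for t, rr in zip(beat_times_s, rr_intervals_ms):
        index = int(t // epoch_s)          # epoch that this beat falls into
        while len(epochs) <= index:        # create empty epochs up to that index
            epochs.append([])
        epochs[index].append(rr)
    return epochs


def expected_epoch_count(duration_s, epoch_s=30.0):
    """Number of complete epochs contained in a recording of the given length."""
    return int(duration_s // epoch_s)


# A 2-hour recording yields 240 samples of 30 seconds each.
print(expected_epoch_count(2 * 3600))
```

With this convention, a 2-hour recording gives 240 samples and a recording of 1 hour 17 minutes gives 154 samples, which matches the sample counts listed for slp01a and slp67x in Table 1.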


Table 1. Sleep Apnea Datasets

Data | Duration (hour-minutes) | Number of samples
slp01a | 2H | 240
slp01b | 3H | 360
slp02a | 3H | 360
slp02b | 2H15M | 270
slp03 | 6H | 720
slp04 | 6H | 720
slp14 | 6H | 720
slp16 | 6H | 720
slp32 | 5H20M | 640
slp37 | 5H50M | 700
slp41 | 6H30M | 780
slp45 | 6H20M | 760
slp48 | 6H20M | 760
slp59 | 4H | 480
slp60 | 5H55M | 710
slp61 | 6H10M | 740
slp66 | 3H40M | 440
slp67x | 1H17M | 154
Total of Sample | | 10274



Figure 2. Sample of ECG signal used for sleep apnea identification 


2.2. Data pre-processing 

The pre-processing stage discards ECG recording data (RR intervals and sleep apnea annotations)
that are out of sync, in order to obtain valid data. In this stage, we remove the slp03 and slp60 data
because they are out of sync. All data resulting from pre-processing are then entered into the formulas
to obtain the HRV features.


2.3. Feature extraction 

In this process, the pre-processed data are processed to obtain the HRV features. We employ
four techniques in this paper, i.e. time domain, geometrical, Poincare, and frequency domain [13], [14].
The features include SDNN, RMSSD, NN50, pNN50, Mean RR, HRV Index, SD1, SD2, SD1/SD2, TP, VLF, LF, HF,
LF/HF, LFnorm, and HFnorm. The time domain features are derived from the ECG signal data in the form of
RR intervals, while the frequency domain features are derived from the power spectral density (PSD) of
the RR interval series. There are 18 features used in this research, as shown in Table 2. We perform
feature extraction using MATLAB. Figure 3 shows the values of the features extracted with the time
domain method. Figure 4 shows the results of feature extraction with frequency domain techniques, while
Figure 5 shows the feature extraction results with Poincare techniques.

Table 2. HRV Features List

No | Feature | Method | Description | Equation
1 | AVNN | Time Domain | The average of all NN (or RR) intervals | $\mathrm{AVNN} = \frac{1}{N}\sum_{j=1}^{N} RR_j$
2 | SDNN | Time Domain | The standard deviation of all NN (or RR) intervals | $\mathrm{SDNN} = \sqrt{\frac{1}{N-1}\sum_{j=1}^{N}\left(RR_j - \overline{RR}\right)^2}$
3 | RMSSD | Time Domain | The square root of the average of the sum of the squares of differences between adjacent RR intervals | $\mathrm{RMSSD} = \sqrt{\frac{1}{N-1}\sum_{j=1}^{N-1}\left(RR_{j+1} - RR_j\right)^2}$
4 | SDSD | Time Domain | Standard deviation of differences between adjacent RR intervals | -
5 | NN50 | Time Domain | The count of adjacent NN (or RR) intervals whose difference is more than 50 ms | $\mathrm{NN50} = \text{number of } \left(RR_{j+1} - RR_j\right) > 50\ \mathrm{ms}$
6 | pNN50 | Time Domain | The division of NN50 by the total of all RR intervals minus one, times 100 | $\mathrm{pNN50} = \frac{\mathrm{NN50}}{N-1} \times 100$
7 | HRV Triangular Index | Geometrical | The total number of RR intervals divided by the peak of the histogram created from RR intervals data with 7.8125 ms bin size | -
8 | SD1 | Poincare | The standard deviation of points perpendicular to the axis of line-of-identity | $\mathrm{SD1}^2 = \frac{1}{2}\,\mathrm{SDSD}^2$
9 | SD2 | Poincare | The standard deviation of points along the axis of line-of-identity | $\mathrm{SD2}^2 = 2\,\mathrm{SDNN}^2 - \frac{1}{2}\,\mathrm{SDSD}^2$
10 | SD1SD2 Ratio | Poincare | Ratio of SD1 and SD2 | $\mathrm{SD1SD2\ Ratio} = \frac{\mathrm{SD1}}{\mathrm{SD2}}$
11 | S | Poincare | Area of ellipse | $S = \pi \times \mathrm{SD1} \times \mathrm{SD2}$
12 | TP | Frequency Domain | Total power | -
13 | VLF | Frequency Domain | Total power of 0 to 0.04 Hz | -
14 | LF | Frequency Domain | Total power of 0.04 to 0.15 Hz | -
15 | HF | Frequency Domain | Total power of 0.15 to 0.4 Hz | -
16 | LFHF Ratio | Frequency Domain | Ratio of LF and HF | -
17 | LFnorm | Frequency Domain | Normalized LF | $\mathrm{LFnorm} = \frac{\mathrm{LF}}{\mathrm{TP} - \mathrm{VLF}}$
18 | HFnorm | Frequency Domain | Normalized HF | $\mathrm{HFnorm} = \frac{\mathrm{HF}}{\mathrm{TP} - \mathrm{VLF}}$
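
The time domain and Poincare definitions in Table 2 can be illustrated with a short sketch. The paper performs feature extraction in MATLAB, so the Python below is only an illustration under stated assumptions: the function name `hrv_time_and_poincare` is hypothetical, the standard deviations use the sample form with 1/(N-1), and NN50 counts absolute differences (the usual convention, whereas Table 2 writes the difference without an absolute value).

```python
import math
import statistics


def hrv_time_and_poincare(rr_ms):
    """Compute time domain and Poincare HRV features from RR intervals in ms."""
    n = len(rr_ms)
    if n < 3:
        raise ValueError("need at least three RR intervals")
    # Successive differences RR_{j+1} - RR_j (there are N-1 of them).
    diffs = [rr_ms[j + 1] - rr_ms[j] for j in range(n - 1)]

    avnn = sum(rr_ms) / n
    sdnn = statistics.stdev(rr_ms)                       # sample SD, 1/(N-1) (assumption)
    rmssd = math.sqrt(sum(d * d for d in diffs) / (n - 1))
    sdsd = statistics.stdev(diffs)                       # SD of successive differences
    nn50 = sum(1 for d in diffs if abs(d) > 50)          # absolute difference (assumption)
    pnn50 = nn50 / (n - 1) * 100

    sd1 = math.sqrt(0.5 * sdsd ** 2)                     # SD1^2 = (1/2) SDSD^2
    sd2 = math.sqrt(max(2 * sdnn ** 2 - 0.5 * sdsd ** 2, 0.0))
    return {
        "AVNN": avnn, "SDNN": sdnn, "RMSSD": rmssd, "SDSD": sdsd,
        "NN50": nn50, "pNN50": pnn50,
        "SD1": sd1, "SD2": sd2,
        "SD1SD2 Ratio": sd1 / sd2 if sd2 > 0 else float("nan"),
        "S": math.pi * sd1 * sd2,                        # area of the fitted ellipse
    }


# Toy RR series (ms), invented for illustration only.
features = hrv_time_and_poincare([800, 860, 790, 810, 900])
print(features["AVNN"], features["NN50"], features["pNN50"])
```

For the toy series, the successive differences are 60, -70, 20, and 90 ms, so three of the four differences exceed 50 ms in magnitude and pNN50 is 75.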

Time Domain and Geometrical Result

Variable | Units | Value
Mean RR* | (ms) | 923.6
STD RR (SDNN) | (ms) | 44.1
RMSSD | (ms) | 22.5
NN50 | (count) | 4
pNN50 | (%) | 3.2
RR triangular index | | 8.000

Figure 3. Result of time domain & geometrical features

[FFT spectrum plot]

Frequency Band | Peak (Hz) | Power (ms^2) | Power (%) | Power (n.u.)
VLF (0-0.04 Hz) | 0.0391 | 863 | 29.8 |
LF (0.04-0.15 Hz) | 0.0508 | 1889 | 65.3 | 93.0
HF (0.15-0.4 Hz) | 0.1523 | 142 | 4.9 | 7.0
Total | | 2894 | |

LF/HF: 13.297

Figure 4. Result of frequency domain techniques
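
The normalized powers defined in Table 2 can be checked against the band powers reported in Figure 4. The following is a small worked sketch; the function name `normalized_powers` is hypothetical and the input values are simply the rounded powers read from the figure.

```python
def normalized_powers(tp, vlf, lf, hf):
    """Return LFnorm, HFnorm, and LF/HF from band powers given in ms^2."""
    denom = tp - vlf                      # total power excluding the VLF band
    if denom <= 0 or hf <= 0:
        raise ValueError("TP - VLF and HF must be positive")
    return {
        "LFnorm": lf / denom,             # LFnorm = LF / (TP - VLF)
        "HFnorm": hf / denom,             # HFnorm = HF / (TP - VLF)
        "LF/HF": lf / hf,
    }


# Band powers (ms^2) as reported in Figure 4.
values = normalized_powers(tp=2894.0, vlf=863.0, lf=1889.0, hf=142.0)
print({name: round(v, 4) for name, v in values.items()})
```

With these inputs, LFnorm is about 0.930 and HFnorm about 0.070, in line with the 93.0 and 7.0 normalized units in Figure 4. The LF/HF ratio comes out at about 13.30, slightly above the reported 13.297, presumably because the tabulated powers are rounded.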

Poincare plot

Variable | Units | Value
SD1 | (ms) | 16.2
SD2 | (ms) | 60.2
SD1/SD2 | - | 0.269

[Poincare plot: RR_{n+1} (ms) versus RR_n (ms), both axes spanning roughly 850 to 1000 ms]

Figure 5. Feature extraction with Poincare techniques
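
As a consistency check, the Poincare relations in Table 2 can be applied to the values reported in Figures 3 and 5. Since SD1^2 = (1/2) SDSD^2, the SD2 relation can be rewritten as SD2^2 = 2 SDNN^2 - SD1^2. The sketch below only substitutes the reported, rounded numbers; the variable and function names are hypothetical.

```python
import math

SDNN_MS = 44.1           # STD RR (SDNN) reported in Figure 3
SD1_MS = 16.2            # SD1 reported in Figure 5
SD2_REPORTED_MS = 60.2   # SD2 reported in Figure 5


def sd2_from_sdnn_and_sd1(sdnn, sd1):
    """SD2 from SD2^2 = 2*SDNN^2 - SD1^2 (Table 2 with SD1^2 = SDSD^2 / 2)."""
    return math.sqrt(2.0 * sdnn ** 2 - sd1 ** 2)


sd2 = sd2_from_sdnn_and_sd1(SDNN_MS, SD1_MS)
ratio = SD1_MS / SD2_REPORTED_MS
ellipse_area = math.pi * SD1_MS * SD2_REPORTED_MS   # S = pi * SD1 * SD2, in ms^2

print(f"SD2 = {sd2:.1f} ms, SD1/SD2 = {ratio:.3f}, S = {ellipse_area:.0f} ms^2")
```

The recomputed SD2 is about 60.2 ms and the ratio about 0.269, both in agreement with Figure 5.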


2.4. Classification process 
In this paper, we evaluate the following four classification methods for classifying sleep apnea.
All of these are among the most well-known classification methods [15], i.e.:

a. Artificial Neural Network (ANN) is an algorithm built from a network of small processing units that
is modeled on the behavior of human neural networks [16]. Neural networks are used for pattern
classification, mapping of input patterns into new output patterns, mapping of similar patterns, and
problem solving [17]. ANN is commonly applied to noise identification, wave analysis, speed analysis,
reservoir characterization, etc.

b. The Naive Bayes classifier is a simple probabilistic classifier that applies Bayes' theorem with
strong independence assumptions [18]. Because the variables are assumed to be independent, only the
variance of each variable within a class is required to determine the classification. The advantage of
this method is that it requires only a small amount of training data to estimate the parameters needed
in the classification process.

c. K-Nearest Neighbor (K-NN) is a supervised algorithm in which an object is classified by the
majority category among its nearest neighbors. The goal is to classify new objects based on their
attributes and on the sample data [19]. The K-NN algorithm computes the distance from the new data to
the sample data and selects the K nearest neighbors. The majority class among these neighbors is used
as the predicted result for the new data.

d. Support Vector Machine (SVM) is a technique for making predictions in both classification and
regression cases. The aim of SVM is to find an optimal separating hyperplane (OSH), i.e. the hyperplane
with the largest margin between the two classes. It is found by maximizing the margin between the
classes. First, SVM transforms the input data into a higher dimensional space by using a kernel
function. Then, SVM constructs a linear OSH between the two classes in the transformed space. The data
vectors nearest to the constructed hyperplane in the transformed space are called the support vectors
[10].
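
To make the majority-vote rule of K-NN concrete, a minimal sketch is given below. It is not the RapidMiner operator used in this paper: the function name `knn_predict`, the Euclidean distance, and the toy two-feature samples are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest samples."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training samples")
    # Euclidean distance from the query to every training sample.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    # Count the labels of the k closest samples and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy feature vectors (two hypothetical HRV features per sample), for illustration only.
train_x = [(20.0, 5.0), (25.0, 8.0), (60.0, 30.0), (65.0, 35.0)]
train_y = ["apnea", "apnea", "non-apnea", "non-apnea"]
print(knn_predict(train_x, train_y, query=(22.0, 6.0), k=3))
```

In this toy case, two of the three nearest neighbors of the query carry the label "apnea", so that label is returned.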


2.5. Performance evaluation 

The data are divided into two parts, 70% training data and 30% testing data, to build and test the
classification model. The extracted feature data are randomized before being classified with the ANN,
KNN, N-Bayes, and linear SVM classification methods. We perform the classification with RapidMiner. The
experiments are conducted several times for each of the following class configurations:
a. 2 classes: sleep apnea/non-sleep apnea
b. 3 classes: non-sleep apnea/hypopnea/obstructive apnea
c. 4 classes: non-sleep apnea/hypopnea/obstructive apnea/central apnea

We perform classification using the subject-specific scheme and the subject-independent scheme [20].
The difference between them lies in the selection of the training and testing sets. In the
subject-specific scheme, the training and testing sets are selected from the same record before being
input to the classifier model. In the subject-independent scheme, on the other hand, the training and
testing sets are drawn from all records combined. The subject-independent scheme is more practical
[20]. However, the subject-specific scheme is more useful for multi-night investigations and for
evaluating the importance of each feature for sleep staging.
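
The two schemes differ only in where the 70%/30% split is applied, which the following sketch illustrates. It is an assumption-laden illustration rather than the RapidMiner workflow used in this paper: the function names are hypothetical, and `records` is assumed to be a mapping from record name to a list of samples.

```python
import random


def split_70_30(samples, rng, train_fraction=0.7):
    """Shuffle the samples and split them into training and testing lists."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def subject_specific_splits(records, seed=0):
    """Split every record separately: training and testing come from the same record."""
    rng = random.Random(seed)
    return {name: split_70_30(samples, rng) for name, samples in records.items()}


def subject_independent_split(records, seed=0):
    """Pool the samples of all records first, then split the pooled data once."""
    rng = random.Random(seed)
    pooled = [sample for samples in records.values() for sample in samples]
    return split_70_30(pooled, rng)


# Toy records with placeholder integer samples, for illustration only.
records = {"rec_a": list(range(10)), "rec_b": list(range(100, 120))}
per_record = subject_specific_splits(records)
train, test = subject_independent_split(records)
print({name: (len(tr), len(te)) for name, (tr, te) in per_record.items()})
print(len(train), len(test))
```

In the subject-specific case, one model would be trained and tested per record and the accuracies averaged afterwards; in the subject-independent case, a single model is trained and tested on the pooled split.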


3. RESULTS AND DISCUSSION 

First, we perform classification of sleep apnea for 2 classes. As mentioned in Section 2, we evaluate
the performance of 4 classification methods, i.e. ANN, KNN, N-Bayes, and linear SVM. Table 3 shows the
results of the 2-class classification with the subject-specific scheme. We calculate the accuracy for
every subject (16 subjects in total are used in this experiment) and then average the per-subject
accuracies to obtain the mean accuracy. Table 3 shows that the linear SVM has the best performance
among the four methods. The linear SVM achieves 75.87% accuracy, about 1.9 percentage points more than
the second-best method, i.e. ANN (73.96%).


Table 3. Accuracy for 2 Classes Classification with Subject-specific Scheme

Data | ANN | KNN | N-Bayes | SVM (linear)
slp01a | 77.78% | 77.78% | 76.39% | 83.33%
slp01b | 86.11% | 79.63% | 87.96% | 86.11%
slp02a | 82.41% | 69.44% | 64.81% | 82.41%
slp02b | 85.00% | 80.00% | 72.50% | 91.25%
slp03 | 80.00% | 64.29% | 81.90% | 76.67%
slp04 | 81.94% | 71.30% | 84.72% | 85.65%
slp14 | 71.03% | 61.21% | 67.29% | 72.90%
slp16 | 62.02% | 53.37% | 66.35% | 64.90%
slp32 | 82.81% | 73.96% | 72.40% | 83.33%
slp37 | 84.69% | 79.90% | 77.99% | 85.17%
slp48 | 73.25% | 62.72% | 71.05% | 73.68%
slp59 | 56.72% | 49.64% | 47.45% | 48.18%
slp60 | 65.24% | 56.67% | 60.48% | 63.33%
slp61 | 60.19% | 60.65% | 73.15% | 68.52%
slp66 | 68.94% | 64.39% | 75.76% | 81.06%
slp67x | 65.22% | 47.83% | 73.91% | 67.39%
Mean | 73.96% | 65.80% | 72.13% | 75.87%
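
The mean row of Table 3 can be reproduced by averaging the per-record accuracies. The short sketch below does this for the two best methods, ANN and linear SVM, using the values listed in the table; the variable names are hypothetical.

```python
# Per-record accuracies (%) for ANN and linear SVM, in the row order of Table 3.
ann = [77.78, 86.11, 82.41, 85.00, 80.00, 81.94, 71.03, 62.02,
       82.81, 84.69, 73.25, 56.72, 65.24, 60.19, 68.94, 65.22]
svm = [83.33, 86.11, 82.41, 91.25, 76.67, 85.65, 72.90, 64.90,
       83.33, 85.17, 73.68, 48.18, 63.33, 68.52, 81.06, 67.39]

mean_ann = sum(ann) / len(ann)   # unweighted mean over the 16 records
mean_svm = sum(svm) / len(svm)
gap = mean_svm - mean_ann        # difference in percentage points

print(f"ANN mean: {mean_ann:.3f}  SVM mean: {mean_svm:.3f}  gap: {gap:.3f}")
```

The averages agree with the tabulated means of 73.96% and 75.87% to within rounding, and the gap between the two methods is roughly 1.9 percentage points. Note that this is an unweighted mean over records, so records with few samples count as much as long ones.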


Figure 6 and Figure 7 show the classification results for 2, 3, and 4 classes with the
subject-specific scheme and the subject-independent scheme, respectively. We can observe in Figure 6
that SVM achieves the best performance among all methods, with 75.87%, 73.58%, and 71.58% for 2, 3, and
4 classes, respectively, while ANN is the second best, with accuracy about 1-2% lower than that of SVM.
Interestingly, Figure 7 shows that ANN performs slightly better than SVM for 3- and 4-class
classification. Overall, across the subject-independent and subject-specific schemes, SVM shows the
best performance among the methods. This may be due to the nature of SVM, which performs data selection
before doing classification and therefore achieves better accuracy. However, the anomaly in the
subject-independent scheme with 3- and 4-class classification shows that the accuracy is sometimes also
influenced by the distribution of the training and testing data.

From Figure 6 and Figure 7, we can also evaluate the generalization capability of each method. For
example, these results show that the ANN algorithm has good generalization capability because its
decrease in accuracy between the subject-specific and subject-independent schemes is the lowest and
most constant among all methods, i.e. around 4% for 2-, 3-, and 4-class classification. SVM, even
though it shows the best overall performance as explained in the previous paragraph, suffers an 8%
decrease in accuracy for 3- and 4-class classification. N-Bayes shows the worst generalization
capability, suffering a decrease of almost 20% in 4-class classification accuracy.

In another experiment, we evaluate which feature extraction technique performs better for each
classification method. The result is shown in Figure 8. The time domain features show dominant
performance with the ANN and SVM methods, while the Poincare and frequency-domain techniques show the
best performance with the N-Bayes and KNN methods, respectively. These results mean that each feature
extraction method performs differently with each classification method. However, for the sleep apnea
case, the time domain features show the best performance overall.

Figure 8. Feature extraction method comparison


Finally, we compare our results with existing papers, as shown in Table 4. The comparison is
performed for classification with 2 classes. Table 4 shows that our result outperforms the results of
the other works by Yilmaz and Erazo with an accuracy of 75.89%. In the future, a sleep apnea
classification model that uses the time domain, Poincare, geometrical, and frequency domain HRV
features of the ECG signal may produce better accuracy. It could then support real-time sleep aids
based on the ECG signal recorded while a person sleeps.


Table 4. Comparison with other Works

Author | Year | Approach | Accuracy Result
Yilmaz [21] | 2010 | KNN, QDA, SVM | 74.4%
Erazo [22] | 2014 | ANN & SVM | 55.94%
Proposed | 2017 | ANN & SVM Linear | 74.85%


4. CONCLUSION 

In this paper, we have evaluated sleep apnea identification using HRV features of the ECG signal.
We have compared which classification method performs better and which HRV features show the best
performance for sleep apnea identification. We perform the classification for 2, 3, and 4 classes of
sleep apnea. Our simulation results show that the linear SVM achieves the best accuracy compared to the
three other methods, while the ANN method shows the best generalization capability among all methods.
In addition, the time domain features show the most dominant performance among the HRV features.
Finally, we show that our work achieves slightly better performance than the other references. For
future research, we are going to employ the classification method developed in this paper in a sleep
apnea monitoring prototype. We are going to build a portable and contactless prototype that we hope can
help people to self-monitor sleep apnea symptoms at home.